We introduce explicit schemes based on the polarization phenomenon for the tasks of one-way
secret-key agreement from common randomness and private channel coding. For the former task, we
show how to use common randomness and insecure one-way communication to obtain a strongly secure
key such that the key construction has a complexity essentially linear in the blocklength and the
rate at which the key is produced is optimal, i.e., equal to the one-way secret-key rate. For the
latter task, we present a private channel coding scheme that achieves the secrecy capacity using
the condition of strong secrecy and whose encoding and decoding complexity are again essentially
linear in the blocklength.
I. INTRODUCTION

Consider two parties, Alice and Bob, connected by an authentic but otherwise fully insecure
communication channel. It has been shown that without having access to additional resources, it
is impossible for them to carry out information-theoretically secure private communication [1, 2].
In particular they are unable to generate an unconditionally secure key with which to encrypt
messages transmitted over the insecure channel. However, if Alice and Bob have access to correlated
randomness about which an adversary (Eve) has only partial knowledge, the situation changes
completely: information-theoretically secure secret-key agreement and private communication become
possible. Alternatively, if Alice and Bob are connected by a noisy discrete memoryless channel
(DMC) to which Eve has only limited access (the so-called wiretap channel scenario of Wyner [3],
Csiszár and Körner [4], and Maurer [2]), private communication is again possible.

In this paper, we present explicit schemes for efficient one-way secret-key agreement from
common randomness and for private channel coding in the wiretap channel scenario. Our schemes are
based on polar codes, a family of capacity-achieving linear codes, introduced by Arikan [5], that can
be encoded and decoded efficiently. Previous work by us in a quantum setup [6] already implies
that practically efficient one-way secret-key agreement and private channel coding in a classical
setup is possible, where a practically efficient scheme is one whose computational complexity is
essentially linear in the blocklength. The aim of this paper is to explain the schemes in detail and
give a purely classical proof that the schemes are reliable, secure, practically efficient and achieve
optimal rates. Section II introduces the problems of performing one-way secret-key agreement and
private channel coding. We summarize known and new results about the optimal rates for these two
problems for different wiretap channel scenarios. In Section III, we explain how to obtain one-way
secret-key agreement that is practically efficient, strongly secure, reliable, and achieves the
one-way secret-key rate. However, we are not able to give an efficient algorithm for code construction.
Section IV introduces a similar scheme that can be used for strongly secure private channel coding
at the secrecy capacity. Finally, in Section V, we state two open problems that are of interest in
the setup of this paper as well as in the quantum mechanical scenario introduced in [6].
II. BACKGROUND AND CONTRIBUTIONS

A. Notation and Definitions 

Let $[k] = \{1, \dots, k\}$ for $k \in \mathbb{Z}^+$. For $x \in \mathbb{Z}_2^k$ and
$\mathcal{I} \subseteq [k]$ we have $x[\mathcal{I}] = [x_i : i \in \mathcal{I}]$,
$x^i = [x_1, \dots, x_i]$ and $x_j^i = [x_j, \dots, x_i]$ for $j < i$. The set $\mathcal{A}^c$
denotes the complement of the set $\mathcal{A}$. The uniform distribution on an arbitrary random
variable $X$ is denoted by $\overline{P}_X$. For distributions $P$ and $Q$ over the same alphabet
$\mathcal{X}$, the variational distance is defined by
$\delta(P, Q) := \frac{1}{2} \sum_{x \in \mathcal{X}} |P(x) - Q(x)|$. The notation
$X -\!\circ\!- Y -\!\circ\!- Z$ means that the random variables $X, Y, Z$ form
a Markov chain in the given order.
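
As a small illustration of the variational distance just defined, the following sketch evaluates it for distributions stored as Python dictionaries. The function names `variational_distance` and `uniform_pmf` are illustrative choices, not notation from the paper, and the factor 1/2 follows the convention that the variational distance is half the $L_1$ distance.

```python
def variational_distance(p, q):
    """delta(P, Q) = 1/2 * sum_x |P(x) - Q(x)| for pmfs stored as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def uniform_pmf(alphabet):
    """Uniform distribution on a finite alphabet (hypothetical helper)."""
    symbols = list(alphabet)
    return {x: 1.0 / len(symbols) for x in symbols}


biased = {0: 0.75, 1: 0.25}
print(variational_distance(biased, uniform_pmf([0, 1])))  # 0.25
```

A distance of 0 means the two distributions coincide, and a distance of 1 means they have disjoint supports.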

In this setup we consider a discrete memoryless wiretap channel (DM-WTC)
$\mathsf{W} : \mathcal{X} \to \mathcal{Y} \times \mathcal{Z}$,
which is characterized by its transition probability distribution $P_{Y,Z|X}$. We assume that the
variable $X$ belongs to Alice, $Y$ to Bob and $Z$ to Eve.

According to Körner and Marton [7], a DM-WTC $\mathsf{W} : \mathcal{X} \to \mathcal{Y} \times \mathcal{Z}$
is termed more capable if $I(X;Y) \geq I(X;Z)$ for every possible distribution on $X$. The channel
$\mathsf{W}$ is termed less noisy if $I(U;Y) \geq I(U;Z)$ for every possible distribution on $(U,X)$
where $U$ has finite support and $U -\!\circ\!- X -\!\circ\!- (Y,Z)$ form a Markov chain. If
$X -\!\circ\!- Y -\!\circ\!- Z$ form a Markov chain, $\mathsf{W}$ is called degraded.
It has been shown [ ] that being more capable is a strictly weaker condition than being less noisy,
which is a strictly weaker condition than being degraded. Hence, having a DM-WTC $\mathsf{W}$ which is
degraded implies that $\mathsf{W}$ is less noisy, which again implies that $\mathsf{W}$ is also more capable.

B. Polarization Phenomenon 

Let $X^N$ be a vector whose entries are i.i.d. Bernoulli($p$) distributed for $p \in [0, 1]$ and
$N = 2^n$ where $n \in \mathbb{Z}^+$. Then define $U^N = G_N X^N$, where $G_N$ denotes the
polarization (or polar) transform which can be represented by the matrix

$$ G_N := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}^{\otimes \log N} , \tag{1} $$

where $A^{\otimes k}$ denotes the $k$th Kronecker power of an arbitrary matrix $A$. Furthermore, let
$Y^N = \mathsf{W}^N X^N$, where $\mathsf{W}^N$ denotes $N$ independent uses of a DMC
$\mathsf{W} : \mathcal{X} \to \mathcal{Y}$. For $\epsilon \in (0, 1)$ we may define the two sets

$$ \mathcal{R}_\epsilon^N(X|Y) := \{ i \in [N] : H(U_i | U^{i-1}, Y^N) \geq 1 - \epsilon \} \quad \text{and} \tag{2} $$

$$ \mathcal{D}_\epsilon^N(X|Y) := \{ i \in [N] : H(U_i | U^{i-1}, Y^N) \leq \epsilon \} . \tag{3} $$

The former consists of outputs $U_i$ which are essentially uniformly random, even given all previous
outputs $U^{i-1}$ as well as $Y^N$, while the latter set consists of the essentially deterministic outputs.
The polarization phenomenon is that essentially all outputs are in one of these two subsets, and
their sizes are given by the conditional entropy of the input $X$ given $Y$.
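
The polar transform can be sketched in a few lines. The sketch below assumes that the kernel in (1) is the 2×2 matrix [[1, 1], [0, 1]] acting on column vectors over GF(2), so that $G_N$ has the block form [[G, G], [0, G]] with $G = G_{N/2}$; the function name `polar_transform` is an illustrative choice.

```python
def polar_transform(x):
    """Compute u = G_N x over GF(2), assuming the kernel [[1, 1], [0, 1]]."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    half = n // 2
    top = polar_transform(x[:half])       # G_{N/2} applied to the first half
    bottom = polar_transform(x[half:])    # G_{N/2} applied to the second half
    # Block structure [[G, G], [0, G]]: XOR the halves on top, copy the bottom.
    return [a ^ b for a, b in zip(top, bottom)] + bottom


x = [1, 0, 1, 1]
u = polar_transform(x)
print(u)                    # [1, 1, 0, 1]
print(polar_transform(u))   # [1, 0, 1, 1]
```

Applying the transform twice returns the input, which reflects that under this assumption the transform is its own inverse over GF(2).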

Theorem 1 (Polarization Phenomenon [ ]). For any $\epsilon \in (0, 1)$

$$ |\mathcal{R}_\epsilon^N(X|Y)| = N H(X|Y) - o(N) \quad \text{and} \quad |\mathcal{D}_\epsilon^N(X|Y)| = N (1 - H(X|Y)) - o(N) . \tag{4} $$

Based on this theorem it is possible to construct a family of linear error correcting codes, called 
polar codes, that have several desirable attributes [5, 9-11]: they provably achieve the capacity 
of any DMC; they have an encoding and decoding complexity that is essentially linear in the 
blocklength N; the error probability decays exponentially in the square root of the blocklength. 
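
To make the polarization phenomenon concrete, the following brute-force sketch computes the conditional entropies $H(U_i | U^{i-1})$ for a Bernoulli($p$) source without side information (that is, with trivial $Y$) and reads off the random and deterministic sets for a chosen threshold. It enumerates all $2^N$ inputs, so it is only meant for very small $N$. The kernel [[1, 1], [0, 1]], the function names, and the parameter values are assumptions made for illustration.

```python
from itertools import product
from math import log2


def polar_transform(x):
    """u = G_N x over GF(2), assuming the kernel [[1, 1], [0, 1]]."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    top = polar_transform(x[:half])
    bottom = polar_transform(x[half:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)


def conditional_entropies(p, N):
    """Return [H(U_i | U^{i-1})] for i = 1..N by enumerating all 2^N inputs."""
    pmf_u = {}
    for x in product((0, 1), repeat=N):
        weight = sum(x)
        prob = p ** weight * (1 - p) ** (N - weight)
        u = tuple(polar_transform(list(x)))
        pmf_u[u] = pmf_u.get(u, 0.0) + prob

    def prefix_entropy(i):
        # Entropy of the first i outputs U^i.
        marginal = {}
        for u, prob in pmf_u.items():
            marginal[u[:i]] = marginal.get(u[:i], 0.0) + prob
        return -sum(q * log2(q) for q in marginal.values() if q > 0)

    joint = [prefix_entropy(i) for i in range(N + 1)]
    # Chain rule: H(U_i | U^{i-1}) = H(U^i) - H(U^{i-1}).
    return [joint[i + 1] - joint[i] for i in range(N)]


p, N, eps = 0.11, 8, 0.2
entropies = conditional_entropies(p, N)
random_set = [i + 1 for i, h in enumerate(entropies) if h >= 1 - eps]
deterministic_set = [i + 1 for i, h in enumerate(entropies) if h <= eps]
print([round(h, 3) for h in entropies])
print(random_set, deterministic_set)
```

By the chain rule the conditional entropies always sum to $N H(X)$, and the transform only redistributes this total unevenly across the indices.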



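Correlated sequences of binary random variables may be polarized using a multilevel construction,
as shown in [ ].^1 Given $M$ i.i.d. instances of a sequence $X = (X_{(1)}, X_{(2)}, \dots, X_{(K)})$ and
possibly a correlated random variable $Y$, the basic idea is to first polarize $X_{(1)}^M$ relative to
$Y^M$, then treat $X_{(1)}^M Y^M$ as side information in polarizing $X_{(2)}^M$, and so on. More
precisely, defining $U_{(j)}^M = G_M X_{(j)}^M$ for $j = 1, \dots, K$, we may define the random and
deterministic sets for each $j$ as

$$ \mathcal{R}^M_{\epsilon,(j)}(X_{(j)}|X_{(j-1)}, \dots, X_{(1)}, Y) = \{ i \in [M] : H(U_{(j),i} | U_{(j)}^{i-1}, X_{(j-1)}^M, \dots, X_{(1)}^M, Y^M) \geq 1 - \epsilon \}, \tag{5} $$

$$ \mathcal{D}^M_{\epsilon,(j)}(X_{(j)}|X_{(j-1)}, \dots, X_{(1)}, Y) = \{ i \in [M] : H(U_{(j),i} | U_{(j)}^{i-1}, X_{(j-1)}^M, \dots, X_{(1)}^M, Y^M) \leq \epsilon \}. \tag{6} $$

In principle we could choose different $\epsilon$ parameters for each $j$, but this will not be necessary here.
Now, Theorem 1 applies to the random and deterministic sets for every $j$. The sets
$\mathcal{R}^M_\epsilon(X|Y) = \{ \mathcal{R}^M_{\epsilon,(j)}(X_{(j)}|X_{(j-1)}, \dots, X_{(1)}, Y) \}_{j=1}^K$ and
$\mathcal{D}^M_\epsilon(X|Y) = \{ \mathcal{D}^M_{\epsilon,(j)}(X_{(j)}|X_{(j-1)}, \dots, X_{(1)}, Y) \}_{j=1}^K$
have sizes given by

$$ |\mathcal{R}^M_\epsilon(X|Y)| = \sum_{j=1}^K |\mathcal{R}^M_{\epsilon,(j)}(X_{(j)}|X_{(j-1)}, \dots, X_{(1)}, Y)| \tag{7} $$

$$ = \sum_{j=1}^K \left[ M H(X_{(j)}|X_{(1)}, \dots, X_{(j-1)}, Y) - o(M) \right] \tag{8} $$

$$ = M H(X|Y) - o(KM), \tag{9} $$

and

$$ |\mathcal{D}^M_\epsilon(X|Y)| = \sum_{j=1}^K |\mathcal{D}^M_{\epsilon,(j)}(X_{(j)}|X_{(j-1)}, \dots, X_{(1)}, Y)| \tag{10} $$

$$ = \sum_{j=1}^K \left[ M \left( 1 - H(X_{(j)}|X_{(1)}, \dots, X_{(j-1)}, Y) \right) - o(M) \right] \tag{11} $$

$$ = M (K - H(X|Y)) - o(KM). \tag{12} $$

In the following we will make use of both the polarization phenomenon in its original form,
Theorem 1, and the multilevel extension. To simplify the presentation, we denote by $G_M^K$ the $K$
parallel applications of $G_M$ to the $K$ random variables $X_{(1)}^M, \dots, X_{(K)}^M$.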



C. One-Way Secret-Key Agreement

At the start of the one-way secret-key agreement protocol, Alice, Bob, and Eve share $N = 2^n$,
$n \in \mathbb{Z}^+$, i.i.d. copies $(X^N, Y^N, Z^N)$ of a triple of correlated random variables
$(X, Y, Z)$ which take values in discrete but otherwise arbitrary alphabets
$\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$.^2

Alice starts the protocol by performing an operation $\tau_A : \mathcal{X}^N \to (\mathcal{S}^J, \mathcal{C})$
on $X^N$ which outputs both her secret key $S_A^J \in \mathcal{S}^J$ for $\mathcal{S} = \{0, 1\}$ and an
additional random variable $C \in \mathcal{C}$ which she transmits to Bob over an insecure but noiseless
public channel. Bob then performs an operation $\tau_B : (\mathcal{Y}^N, \mathcal{C}) \to \mathcal{S}^J$
on $Y^N$ and the information $C$ he received from Alice to obtain a vector $S_B^J \in \mathcal{S}^J$,
his secret key. The secret key thus produced should be reliable, i.e., satisfy the

$$ \text{reliability condition:} \quad \lim_{N \to \infty} \Pr[S_A^J \neq S_B^J] = 0, \tag{13} $$



^1 An alternative approach is given in [12, 13], where the polarization phenomenon has been
generalized for arbitrary finite fields. We will however focus on the multilevel construction in this paper.
^2 The correlation of the random variables $(X, Y, Z)$ is described by its joint probability
distribution $P_{X,Y,Z}$.



and secure, i.e., satisfy the

$$ \text{(strong) secrecy condition:} \quad \lim_{N \to \infty} \left\| P_{S_A^J, Z^N, C} - \overline{P}_{S_A^J} \times P_{Z^N, C} \right\|_1 = 0, \tag{14} $$

where $\overline{P}_X$ denotes the uniform distribution on random variable $X$.

Historically, secrecy was first characterized by a (weak) secrecy condition of the form

$$ \lim_{N \to \infty} \frac{1}{N} I(S_A^J; Z^N, C) = 0. \tag{15} $$

Maurer and Wolf showed that (15) is not a sufficient secrecy criterion [14, 15] and introduced the
strong secrecy condition

$$ \lim_{N \to \infty} I(S_A^J; Z^N, C) = 0, \tag{16} $$

where in addition it is required that the key is uniformly distributed, i.e.,

$$ \lim_{N \to \infty} \delta(P_{S_A^J}, \overline{P}_{S_A^J}) = 0. \tag{17} $$

In recent years, the strong secrecy condition (16) has often been replaced by (14), since (half)
the $L_1$ distance directly bounds the probability of distinguishing the actual key produced by the
protocol from an ideal key. This operational interpretation is particularly helpful in the finite
blocklength regime. In the limit $N \to \infty$, the two secrecy conditions (14) and (16) are equivalent,
which can be shown using Pinsker's and Fano's inequalities.
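
One direction of this relation can be checked numerically on a toy example. The sketch below takes a small joint distribution of a key $S$ and Eve's observation $Z$, computes the $L_1$ distance to the ideal distribution (uniform key, independent of $Z$), and compares it with the Pinsker bound $\sqrt{2 \ln 2 \, (\log_2 |\mathcal{S}| - H(S|Z))}$. The function name and the example distributions are illustrative assumptions, not taken from the paper.

```python
from math import log, log2, sqrt


def secrecy_metrics(joint):
    """joint[s][z] = P(S=s, Z=z). Returns (L1 distance, Pinsker bound)."""
    num_keys = len(joint)
    num_obs = len(joint[0])
    p_z = [sum(joint[s][z] for s in range(num_keys)) for z in range(num_obs)]
    # L1 distance between P_{S,Z} and (uniform on S) x P_Z.
    l1 = sum(abs(joint[s][z] - p_z[z] / num_keys)
             for s in range(num_keys) for z in range(num_obs))
    # Conditional entropy H(S|Z) in bits.
    h_s_given_z = -sum(joint[s][z] * log2(joint[s][z] / p_z[z])
                       for s in range(num_keys) for z in range(num_obs)
                       if joint[s][z] > 0)
    # Relative entropy to the ideal distribution equals log2|S| - H(S|Z).
    divergence = log2(num_keys) - h_s_given_z
    bound = sqrt(2 * log(2) * max(divergence, 0.0))
    return l1, bound


leaky = [[0.3, 0.2], [0.1, 0.4]]        # key correlated with Eve's observation
ideal = [[0.25, 0.25], [0.25, 0.25]]    # uniform key, independent of Z
print(secrecy_metrics(leaky))
print(secrecy_metrics(ideal))           # (0.0, 0.0)
```

A small entropy deficit $\log_2 |\mathcal{S}| - H(S|Z)$ therefore forces a small $L_1$ distance, which is the direction used when a bound on the conditional entropy of the key is converted into a statement of the form (14).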

Since having weak secrecy is not sufficient, we will only consider strong secrecy in this paper.
It has been proven that each secret-key agreement protocol which achieves weak secrecy can be
transformed into a strongly secure protocol [ ]. However, it is not clear whether the resulting
protocol is guaranteed to be practically efficient.

For one-way communication, Csiszár and Körner [4] and later Ahlswede and Csiszár [16] showed
that the optimal rate $R := \lim_{N \to \infty} \frac{J}{N}$ of generating a secret key satisfying (13)
and (16), called the secret-key rate $S_\to(X;Y|Z)$, is characterized by a closed single-letter formula.

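Theorem 2 ([4, 16]). For triples $(X, Y, Z)$ described by $P_{X,Y,Z}$ as explained above,

$$ S_\to(X;Y|Z) = \begin{cases} \max_{P_{U,V}} & H(U|Z,V) - H(U|Y,V) \\ \text{s.t.} & V -\!\circ\!- U -\!\circ\!- X -\!\circ\!- (Y,Z), \\ & |\mathcal{V}| \leq |\mathcal{X}|, \ |\mathcal{U}| \leq |\mathcal{X}|^2. \end{cases} \tag{18} $$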

The expression for the one-way secret-key rate given in Theorem 2 can be simplified if one
makes additional assumptions about $P_{X,Y,Z}$.

Corollary 3. For $P_{X,Y,Z}$ such that the induced DM-WTC $\mathsf{W}$ described by $P_{Y,Z|X}$ is more capable,

$$ S_\to(X;Y|Z) = \begin{cases} \max_{P_V} & H(X|Z,V) - H(X|Y,V) \\ \text{s.t.} & V -\!\circ\!- X -\!\circ\!- (Y,Z), \\ & |\mathcal{V}| \leq |\mathcal{X}|. \end{cases} \tag{19} $$

Proof. In terms of the mutual information, we have

$$ H(U|Z,V) - H(U|Y,V) = I(U;Y|V) - I(U;Z|V) \tag{20} $$

$$ = I(X,U;Y|V) - I(X,U;Z|V) - \left( I(X;Y|U,V) - I(X;Z|U,V) \right) \tag{21} $$

$$ \leq I(X,U;Y|V) - I(X,U;Z|V) \tag{22} $$

$$ = I(X;Y|V) - I(X;Z|V), \tag{23} $$

using the chain rule, the more capable condition, and the Markov chain properties, respectively.
Thus, the maximum in $S_\to(X;Y|Z)$ can be achieved when omitting $U$. □



Corollary 4. For $P_{X,Y,Z}$ such that the induced DM-WTC $\mathsf{W}$ described by $P_{Y,Z|X}$ is less noisy,

$$ S_\to(X;Y|Z) = H(X|Z) - H(X|Y). \tag{24} $$

Proof. Since $\mathsf{W}$ being less noisy implies $\mathsf{W}$ being more capable, we know that the
one-way secret-key rate is given by (19). Using the chain rule we obtain

$$ H(X|Z,V) - H(X|Y,V) = I(X;Y|V) - I(X;Z|V) \tag{25} $$

$$ = I(X,V;Y) - I(X,V;Z) - I(V;Y) + I(V;Z) \tag{26} $$

$$ = I(X;Y) - I(X;Z) - \left( I(V;Y) - I(V;Z) \right) \tag{27} $$

$$ \leq I(X;Y) - I(X;Z). \tag{28} $$

Equation (27) follows from the chain rule and the Markov chain condition. The inequality uses the
assumption of being less noisy. □
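
As a numerical illustration of (24), the sketch below evaluates $H(X|Z) - H(X|Y)$ for a simple degraded example: $X$ is a uniform bit, Bob sees $X$ through a binary symmetric channel, and Eve sees Bob's observation through a second binary symmetric channel. The example source, the crossover probabilities and all function names are assumptions chosen for illustration.

```python
from itertools import product
from math import log2


def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)


def conditional_entropy(joint, target, given):
    """H(target | given) in bits. joint maps outcome tuples to probabilities;
    target and given are tuples of coordinate indices into the outcomes."""
    pair, cond = {}, {}
    for outcome, prob in joint.items():
        g = tuple(outcome[i] for i in given)
        t = tuple(outcome[i] for i in target)
        pair[(t, g)] = pair.get((t, g), 0.0) + prob
        cond[g] = cond.get(g, 0.0) + prob
    return -sum(q * log2(q / cond[g]) for (t, g), q in pair.items() if q > 0)


def degraded_bsc_source(p_bob, p_eve):
    """X uniform, Y = X xor Bern(p_bob), Z = Y xor Bern(p_eve), so X - Y - Z."""
    joint = {}
    for x, n1, n2 in product((0, 1), repeat=3):
        prob = 0.5 * (p_bob if n1 else 1 - p_bob) * (p_eve if n2 else 1 - p_eve)
        y = x ^ n1
        z = y ^ n2
        joint[(x, y, z)] = joint.get((x, y, z), 0.0) + prob
    return joint


joint = degraded_bsc_source(0.1, 0.2)
rate = conditional_entropy(joint, (0,), (2,)) - conditional_entropy(joint, (0,), (1,))
print(round(rate, 4))
```

For this example the effective crossover probability from $X$ to $Z$ is $0.1 \cdot 0.8 + 0.9 \cdot 0.2 = 0.26$, so the rate equals $h_2(0.26) - h_2(0.1)$, where $h_2$ is the binary entropy function.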

Note that (24) is also equal to the one-way secret-key rate for the case where $\mathsf{W}$ is degraded, as
this implies $\mathsf{W}$ being less noisy. The proof of Theorem 2 does not imply that there exists an efficient
one-way secret-key agreement protocol. A computationally efficient scheme was constructed in [17],
but is not known to be practically efficient.^3

For key agreement with two-way communication, no formula comparable to (18) for the optimal
rate is known. However, it has been shown that the two-way secret-key rate can be strictly larger
than the one-way secret-key rate. It is also known that the intrinsic information
$I(X;Y \downarrow Z) := \min_{P_{Z'|Z}} I(X;Y|Z')$ is an upper bound on $S(X;Y|Z)$, but is not tight [16, 18, 19].

D. Private Channel Coding 

Private channel coding over a wiretap channel is closely related to the task of one-way secret-key
agreement from common randomness (cf. Section II E). Here Alice would like to transmit a
message $M^J \in \mathcal{M}^J$ privately to Bob. The messages can be distributed according to some arbitrary
distribution $P_{M^J}$. To do so, she first encodes the message by computing $X^N = \mathrm{enc}(M^J)$ for some
encoding function $\mathrm{enc} : \mathcal{M}^J \to \mathcal{X}^N$ and then sends $X^N$ over the wiretap channel to Bob (and to
Eve), which is represented by $(Y^N, Z^N) = \mathsf{W}^N X^N$. Bob next decodes the received message to
obtain a guess for Alice's message $\hat{M}^J = \mathrm{dec}(Y^N)$ for some decoding function
$\mathrm{dec} : \mathcal{Y}^N \to \mathcal{M}^J$.
As in secret-key agreement, the private channel coding scheme should be reliable, i.e., satisfy the

$$ \text{reliability condition:} \quad \lim_{J \to \infty} \Pr[M^J \neq \hat{M}^J] = 0 \tag{29} $$

and (strongly) secure, i.e., satisfy the

$$ \text{(strong) secrecy condition:} \quad \lim_{J \to \infty} \left\| P_{M^J, Z^N, C} - P_{M^J} \times P_{Z^N, C} \right\|_1 = 0. \tag{30} $$

The variable $C$ denotes any additional information made public by the protocol.

As mentioned in Section II C, in the limit $J \to \infty$ this strong secrecy condition is equivalent to
the historically older (strong) secrecy condition

$$ \lim_{J \to \infty} I(M^J; Z^N, C) = 0. \tag{31} $$

The highest achievable rate $R := \lim_{N \to \infty} \frac{J}{N}$ fulfilling (29) and (30) is called the secrecy capacity.
Csiszár and Körner showed [4, Corollary 2] that there exists a single-letter formula for the
secrecy capacity.^4



^3 As defined in Section I, we call a scheme practically efficient if its computational complexity is
essentially linear in the blocklength.
^4 Maurer and Wolf showed that the single-letter formula remains valid considering strong secrecy [15].



Theorem 5 ([4]). For an arbitrary DM-WTC $\mathsf{W}$ as introduced above,

$$ C_s = \begin{cases} \max_{P_{V,X}} & H(V|Z) - H(V|Y) \\ \text{s.t.} & V -\!\circ\!- X -\!\circ\!- (Y,Z), \\ & |\mathcal{V}| \leq |\mathcal{X}|. \end{cases} \tag{32} $$

This expression can be simplified using additional assumptions about $\mathsf{W}$.

Corollary 6 ([ ]). If $\mathsf{W}$ is more capable,

$$ C_s = H(X|Z) - H(X|Y). \tag{33} $$

Proof. A proof can be found in [ ] or [20, Section 22.1]. □

E. Previous Work and Our Contributions 

In Section III, we present a one-way secret-key agreement scheme based on polar codes that
achieves the secret-key rate, is strongly secure, reliable and whose implementation is practically
efficient, with complexity $O(N \log N)$ for blocklength $N$. Our protocol improves previous efficient
secret-key constructions [2x], where only weak secrecy could be proven and where the eavesdropper
has no prior knowledge and/or degradability assumptions are required. However, we are not able
to give an efficient algorithm for code construction.

In Section IV, we introduce a coding scheme based on polar codes that provably achieves the
secrecy capacity for arbitrary discrete memoryless wiretap channels. We show that the complexity
of the encoding and decoding operations is $O(N \log N)$ for blocklength $N$. Our scheme improves
previous work on practically efficient private channel coding at the optimal rate [ ], where only
weak secrecy could be proven under the additional assumption that the channel $\mathsf{W}$ is degraded.^5
Recently, Bellare et al. introduced an efficient coding scheme that is strongly secure and achieves
the secrecy capacity for binary symmetric wiretap channels [23].^6 Several other constructions of
private channel coding schemes have been reported [24-26], but all achieve only weak secrecy.

The tasks of one-way secret-key agreement and private channel coding explained in the previous 
two subsections are closely related. Maurer showed how a one-way secret-key agreement can be 
derived from a private channel coding scenario [2]. More precisely, he showed how to obtain the 
common randomness needed for one-way secret-key agreement by constructing a "virtual" degraded 
wiretap channel from Alice to Bob. This approach can be used to obtain the one-way secret-key 
rate from the secrecy capacity result in the wiretap channel scenario [20, Section 22.4.3]. 

One of the main advantages of the two schemes introduced in this paper is that they are both 
practically efficient. However, even given a practically efficient private coding scheme, it is not 
known that Maurer's construction will yield a practically efficient scheme for secret key agreement. 
For this reason, as well as simplicity of presentation, we treat the one-way secret-key agreement 
and the private channel coding problem separately in the two sections to follow. 

III. ONE-WAY SECRET-KEY AGREEMENT SCHEME 

Our key agreement protocol is a concatenation of two subprotocols, an inner and an outer
layer, as depicted in Figure 1. The protocol operates on blocks of $N$ i.i.d. triples $(X, Y, Z)$, which
are divided into $M$ sub-blocks of size $L$ for input to the inner layer. In the following we assume
$\mathcal{X} = \{0, 1\}$, which however is only for convenience; the techniques of [ ] and [ ] can be used to
generalize the schemes to discrete memoryless wiretap channels with arbitrary input size.

^5 Note that Mahdavifar and Vardy showed that their scheme achieves strong secrecy if the channel to
Eve (induced from $\mathsf{W}$) is noiseless. Otherwise their scheme is not provably reliable [ ].
^6 They claim that their scheme works for a large class of wiretap channels. However, this class has
not been characterized precisely so far. It is therefore not clear whether their scheme requires, for
example, degradability assumptions. Note that to obtain strong secrecy for an arbitrarily distributed
message, it is required that the wiretap channel is symmetric [23, Lemma 14].

The task of the inner layer is to perform information reconciliation and that of the outer layer 
is to perform privacy amplification. Information reconciliation refers to the process of carrying out 
error correction to ensure that Alice and Bob obtain a shared bit string, and here we only allow 
communication from Alice to Bob for this purpose. On the other hand, privacy amplification refers 
to the process of distilling from Alice's and Bob's shared bit string a smaller set of bits whose 
correlation with the information available to Eve is below a desired threshold. 

Each subprotocol in our scheme is based on the polarization phenomenon. For information
reconciliation of Alice's random variable $X^L$ relative to Bob's information $Y^L$, Alice applies a polar
transformation to $X^L$ and forwards the bits of the complement of the deterministic set
$\mathcal{D}^L_{\epsilon_1}(X|Y)$ to Bob over an insecure public channel, which enables him to recover $X^L$
using the standard polar decoder [5]. Her remaining information is then fed into a multilevel polar
transformation and the bits of the random set are kept as the secret key.

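Let us now define the protocol more precisely. For $L = 2^\ell$, $\ell \in \mathbb{Z}^+$, let $V^L = G_L X^L$
where $G_L$ is as defined in (1). For $\epsilon_1 > 0$, we define

$$ \mathcal{E}_K := \mathcal{D}^L_{\epsilon_1}(X|Y), \tag{34} $$

with $K := |\mathcal{D}^L_{\epsilon_1}(X|Y)|$. Then, let $T_{(j)} = V^L[\mathcal{E}_K]_j$ for $j = 1, \dots, K$ and
$C_{(i)} = V^L[\mathcal{E}_K^c]_i$ for $i = 1, \dots, L - K$ so that $T = (T_{(1)}, \dots, T_{(K)})$ and
$C = (C_{(1)}, \dots, C_{(L-K)})$. For $\epsilon_2 > 0$ and $U_{(j)}^M = G_M T_{(j)}^M$ for $j = 1, \dots, K$
(or, more briefly, $U^M = G_M^K T^M$), we define

$$ \mathcal{F}_J := \mathcal{R}^M_{\epsilon_2}(T|CZ^L), \tag{35} $$

with $J := |\mathcal{R}^M_{\epsilon_2}(T|CZ^L)|$.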



Protocol 1: One-way secret-key agreement

Given: Index sets $\mathcal{E}_K$ and $\mathcal{F}_J$ (code construction)
Notation: Alice's input: $x^N \in \mathbb{Z}_2^N$ (a realization of $X^N$)
          Bob's / Eve's input: $(y^N, z^N)$ (realizations of $Y^N$ and $Z^N$)
          Alice's output: $s_A^J$
          Bob's output: $s_B^J$

Step 1: Alice computes $v_{i+1}^{i+L} = G_L x_{i+1}^{i+L}$ for all $i \in \{0, L, 2L, \dots, (M-1)L\}$.
Step 2: Alice computes $t_i = v_{i+1}^{i+L}[\mathcal{E}_K]$ for all $i \in \{0, L, 2L, \dots, (M-1)L\}$.
Step 3: Alice sends $c_i = v_{i+1}^{i+L}[\mathcal{E}_K^c]$ for all $i \in \{0, L, 2L, \dots, (M-1)L\}$ over a public channel to Bob.
Step 4: Alice computes $u^M = G_M^K t^M$ and obtains $s_A^J = u^M[\mathcal{F}_J]$.^6
Step 5: Bob applies the standard polar decoder [5, 11] to $(c_i, y_{i+1}^{i+L})$ to obtain $\hat{v}_{i+1}^{i+L}$ and
        $\hat{t}_i = \hat{v}_{i+1}^{i+L}[\mathcal{E}_K]$, for $i \in \{0, L, 2L, \dots, (M-1)L\}$.
Step 6: Bob computes $\hat{u}^M = G_M^K \hat{t}^M$ and obtains $s_B^J = \hat{u}^M[\mathcal{F}_J]$.

^6 The expression $u^M[\mathcal{F}_J]$ is an abuse of notation, as $\mathcal{F}_J$ is not a subset of $[M]$. The
expression should be understood to be the union of the random bits of $u_{(j)}^M$, for all
$j = 1, \dots, K$, as in the definition of $\mathcal{R}^M_{\epsilon_2}(T|CZ^L)$.
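
Alice's side of Protocol 1 (Steps 1 to 4) can be sketched as follows. This is only an illustration of the data flow: the index sets passed in are arbitrary placeholders rather than the result of an actual code construction, positions are 0-based, $\mathcal{F}_J$ is represented as a list of (level, position) pairs, the kernel [[1, 1], [0, 1]] is assumed for the polar transform, and Bob's decoder is not shown.

```python
def polar_transform(x):
    """u = G x over GF(2), assuming the kernel [[1, 1], [0, 1]]."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    top = polar_transform(x[:half])
    bottom = polar_transform(x[half:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


def alice_key_agreement(x, L, E_K, F_J):
    """Steps 1-4 of Protocol 1 on Alice's side (illustrative sketch).

    x   : list of N = L * M input bits
    E_K : 0-based positions inside a length-L block kept for the outer layer
    F_J : list of (level j, position i) pairs, 0-based, selecting the key bits
    Returns (key bits, list of public messages, one per block).
    """
    M = len(x) // L
    kept = set(E_K)
    E_c = [i for i in range(L) if i not in kept]
    t_blocks, c_blocks = [], []
    for b in range(M):
        v = polar_transform(x[b * L:(b + 1) * L])   # Step 1: inner transform
        t_blocks.append([v[i] for i in E_K])        # Step 2: bits kept by Alice
        c_blocks.append([v[i] for i in E_c])        # Step 3: bits sent publicly
    # Step 4: K parallel outer transforms, one per level j, across the M blocks.
    K = len(E_K)
    u = [polar_transform([t_blocks[b][j] for b in range(M)]) for j in range(K)]
    key = [u[j][i] for (j, i) in F_J]
    return key, c_blocks


x = [1, 0, 1, 1, 0, 1, 1, 0]                        # N = 8, L = 4, M = 2
key, public = alice_key_agreement(x, L=4, E_K=[2, 3], F_J=[(0, 0), (1, 0)])
print(key)      # [1, 1]
print(public)   # [[1, 1], [0, 1]]
```

The parameters mirror the toy setup N = 8, L = 4, M = 2, K = 2, J = 2; with realistic blocklengths the index sets would come from the code construction step.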



FIG. 1. The secret-key agreement scheme for the setup $N = 8$, $L = 4$, $M = 2$, $K = 2$, and $J = 2$. We
consider a source that produces $N$ i.i.d. copies $(X^N, Y^N, Z^N)$ of a triple of correlated random variables
$(X, Y, Z)$. Alice performs the operation $\tau_A$, sends $(V^L[\mathcal{E}_K^c])^M$ over a public channel to Bob
and obtains $S_A^J$, her secret key. Bob then performs the operation $\tau_B$ which results in his secret key $S_B^J$.



A. Rate, Reliability, Secrecy, and Efficiency 

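Theorem 7. Protocol 1 allows Alice and Bob to generate a secret key $S_A^J$ respectively $S_B^J$ using
public one-way communication $C^M$ such that for $\beta < \frac{1}{2}$:

$$ \text{Reliability:} \quad \Pr[S_A^J \neq S_B^J] = O\left( M 2^{-L^\beta} \right) \tag{36} $$

$$ \text{Secrecy:} \quad \left\| P_{S_A^J, Z^N, C^M} - \overline{P}_{S_A^J} \times P_{Z^N, C^M} \right\|_1 = O\left( \sqrt{N} \, 2^{-\frac{N^\beta}{2}} \right) \tag{37} $$

$$ \text{Rate:} \quad R := \frac{J}{N} = H(X|Z) - \frac{1}{L} H(V^L[\mathcal{E}_K^c] | Z^L) - \frac{o(N)}{N} \tag{38} $$

All operations by both parties may be performed in $O(N \log N)$ steps.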

Proof. The reliability of Alice's and Bob's key follows from the standard polar decoder error
probability and the union bound. Each instance of the decoding algorithm employed by Bob has an
error probability which scales as $O(2^{-L^\beta})$ for $\beta < \frac{1}{2}$ [ ]; application of the union
bound gives the prefactor $M$.

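To prove the secrecy statement requires more effort. Using Pinsker's inequality we obtain

$$ \left\| P_{S_A^J, Z^N, C^M} - \overline{P}_{S_A^J} \times P_{Z^N, C^M} \right\|_1 \leq \sqrt{ 2 \ln 2 \; D\left( P_{S_A^J, Z^N, C^M} \,\middle\|\, \overline{P}_{S_A^J} \times P_{Z^N, C^M} \right) } \tag{39} $$

$$ = \sqrt{ 2 \ln 2 \left( J - H(S_A^J | Z^N, C^M) \right) }, \tag{40} $$

where the last step uses the chain rule for relative entropies and that $\overline{P}_{S_A^J}$ denotes the
uniform distribution. We can simplify the conditional entropy expression using the chain rule

$$ H(S_A^J | Z^N, C^M) = H\left( U^M[\mathcal{F}_J] \,\middle|\, Z^N, (V^L[\mathcal{E}_K^c])^M \right) \tag{41} $$

$$ = \sum_{j=1}^{K} H\left( U_{(j)}^M[\mathcal{F}_{(j)}] \,\middle|\, U_{(1)}^M[\mathcal{F}_{(1)}], \dots, U_{(j-1)}^M[\mathcal{F}_{(j-1)}], Z^N, (V^L[\mathcal{E}_K^c])^M \right) \tag{42} $$

$$ = \sum_{j=1}^{K} \sum_{i=1}^{|\mathcal{F}_{(j)}|} H\left( U_{(j)}^M[\mathcal{F}_{(j)}]_i \,\middle|\, U_{(j)}^M[\mathcal{F}_{(j)}]^{i-1}, U_{(1)}^M[\mathcal{F}_{(1)}], \dots, U_{(j-1)}^M[\mathcal{F}_{(j-1)}], Z^N, (V^L[\mathcal{E}_K^c])^M \right) \tag{43} $$

$$ \geq \sum_{j=1}^{K} \sum_{i \in \mathcal{F}_{(j)}} H\left( U_{(j),i} \,\middle|\, U_{(j)}^{i-1}, U_{(1)}^M[\mathcal{F}_{(1)}], \dots, U_{(j-1)}^M[\mathcal{F}_{(j-1)}], Z^N, (V^L[\mathcal{E}_K^c])^M \right) \tag{44} $$

$$ \geq J (1 - \epsilon_2), \tag{45} $$

where the first inequality uses the fact that conditioning reduces entropy and the second
inequality follows by the definition of $\mathcal{F}_J$. Recall that we are using the notation
introduced in Section II B. For $\mathcal{F}_J$ as defined in (35), we have
$\mathcal{F}_J = \{ \mathcal{F}_{(j)} \}_{j=1}^K$ where
$\mathcal{F}_{(j)} = \mathcal{R}^M_{\epsilon_2}(T_{(j)} | T_{(j-1)}, \dots, T_{(1)}, C, Z^L)$. The polarization
phenomenon, Theorem 1, implies $J = O(N)$, which together with (40) proves the secrecy statement of
Theorem 7, since $\epsilon_2 = O(2^{-N^\beta})$ for $\beta < \frac{1}{2}$.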

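The rate of the scheme is

$$ R = \frac{J}{N} \tag{46} $$

$$ = \frac{1}{L} H(V^L[\mathcal{E}_K] | V^L[\mathcal{E}_K^c], Z^L) - \frac{o(N)}{N} \tag{47} $$

$$ = \frac{1}{L} \left( H(V^L | Z^L) - H(V^L[\mathcal{E}_K^c] | Z^L) \right) - \frac{o(N)}{N} \tag{48} $$

$$ = H(X|Z) - \frac{1}{L} H(V^L[\mathcal{E}_K^c] | Z^L) - \frac{o(N)}{N}, \tag{49} $$

where (47) uses the polarization phenomenon stated in Theorem 1.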

It remains to show that the computational complexity of the scheme is $O(N \log N)$. Alice
performs the operation $G_L$ in the first layer $M$ times, each requiring $O(L \log L)$ steps [5]. In
the second layer she performs $G_M^K$, or $K$ parallel instances of $G_M$, requiring $O(KM \log M)$ total
steps. From the polarization phenomenon, we have $K = O(L)$, and thus the complexity of Alice's
operations is not worse than $O(N \log N)$. Bob runs $M$ standard polar decoders which can be done
in $O(ML \log L)$ complexity [5, 11]. Bob next performs the polar transform $G_M^K$, whose complexity
is not worse than $O(N \log N)$ as justified above. Thus, the complexity of Bob's operations is also
not worse than $O(N \log N)$. □
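
The $O(N \log N)$ cost of a single polar transform can be illustrated by counting the bit operations of a recursive implementation. The sketch below assumes the kernel [[1, 1], [0, 1]] and counts XORs only; the function name is an illustrative choice.

```python
def polar_transform_counting(x):
    """Return (G x over GF(2), number of XOR operations used)."""
    n = len(x)
    if n == 1:
        return list(x), 0
    half = n // 2
    top, ops_top = polar_transform_counting(x[:half])
    bottom, ops_bottom = polar_transform_counting(x[half:])
    combined = [a ^ b for a, b in zip(top, bottom)]   # n/2 XORs at this level
    return combined + bottom, ops_top + ops_bottom + half


for n in (8, 1024):
    _, ops = polar_transform_counting([0] * n)
    print(n, ops)   # prints "8 12" and then "1024 5120"
```

The count satisfies the recurrence $C(N) = 2 C(N/2) + N/2$ with $C(1) = 0$, which gives $C(N) = \frac{N}{2} \log_2 N$.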



In principle, the two parameters $L$ and $M$ can be chosen freely. However, to maintain the
reliability of the scheme (cf. (36)), $M$ may not grow exponentially fast in $L$. A reasonable choice
would be to have both parameters scale comparably fast, i.e., $\frac{M}{L} = O(1)$.
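
To see why the scaling of $M$ relative to $L$ matters, the following worked comparison evaluates the reliability bound (36) for two choices of $M$. The specific choices $M = L$ and $M = 2^L$ are illustrative assumptions and are not prescribed by the scheme.

```latex
% Choice M = L = \sqrt{N} (so that N = LM), with \beta < 1/2:
\Pr[S_A^J \neq S_B^J] = O\!\left( M \, 2^{-L^{\beta}} \right)
  = O\!\left( \sqrt{N} \, 2^{-N^{\beta/2}} \right) \xrightarrow{\; N \to \infty \;} 0 .

% Choice M = 2^{L}: the bound becomes
M \, 2^{-L^{\beta}} = 2^{\, L - L^{\beta}} \xrightarrow{\; L \to \infty \;} \infty ,
% so (36) no longer gives any reliability guarantee.
```

In the first case the polynomial prefactor is dominated by the stretched-exponential decay, whereas in the second case the union bound over exponentially many inner blocks overwhelms it.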



Corollary 8. The rate of Protocol 1 given in Theorem 7 can be bounded as

$$ R \geq \max\left\{ 0, \, H(X|Z) - H(X|Y) - \frac{o(N)}{N} \right\} . \tag{50} $$

Proof. According to (49) the rate of Protocol 1 is

$$ R = H(X|Z) - \frac{1}{L} H(V^L[\mathcal{E}_K^c] | Z^L) - \frac{o(N)}{N} \tag{51} $$

$$ \geq \max\left\{ 0, \, H(X|Z) - \frac{|\mathcal{E}_K^c|}{L} - \frac{o(N)}{N} \right\} \tag{52} $$

$$ = \max\left\{ 0, \, H(X|Z) - H(X|Y) - \frac{o(N)}{N} \right\} , \tag{53} $$

where (53) uses the polarization phenomenon stated in Theorem 1. □

B. Achieving the Secret-Key Rate 

Theorem 7 together with Corollaries 4 and 8 immediately imply that Protocol 1 achieves the
secret-key rate $S_\to(X;Y|Z)$ if $P_{X,Y,Z}$ is such that the induced DM-WTC $\mathsf{W}$ is less noisy. If we can
solve the optimization problem (18), i.e., find the optimal auxiliary random variables $V$ and $U$, our
one-way secret-key agreement scheme can achieve $S_\to(X;Y|Z)$ for a general setup. We then make
$V$ public, replace $X$ by $U$ and run Protocol 1. Note that finding the optimal random variables
$V$ and $U$ might be difficult. It has been shown that for certain distributions the optimal random
variables $V$ and $U$ can be found analytically [ ].

Two open problems discussed in Section V address the question if Protocol 1 can achieve a rate
that is strictly larger than $\max\{0, H(X|Z) - H(X|Y)\}$ if nothing about the optimal auxiliary
random variables $V$ and $U$ is known, i.e., if we run the protocol directly for $X$ without making $V$
public.

C. Code Construction 

Before the protocol starts one must construct the code, i.e., compute the index sets $\mathcal{E}_K$ and
$\mathcal{F}_J$. The set $\mathcal{E}_K$ can be computed approximately with a linear-time algorithm introduced in [28],
given the distributions $P_X$ and $P_{Y|X}$. Alternatively, Tal and Vardy's older algorithm [29] and its
adaptation to the asymmetric setup [11] can be used.
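
For very small block sizes the inner index set can also be obtained by exhaustive enumeration, which is useful as a sanity check. The sketch below is not the linear-time algorithm of [28] or the algorithm of [29]: it is an exponential-time brute-force computation of $H(V_i | V^{i-1}, Y^L)$ for an assumed toy source (uniform $X$, with $Y$ obtained from $X$ through a binary symmetric channel), with 0-based indices, an assumed kernel [[1, 1], [0, 1]] and illustrative function names.

```python
from itertools import product
from math import log2


def polar_transform(x):
    """v = G x over GF(2), assuming the kernel [[1, 1], [0, 1]]."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    top = polar_transform(x[:half])
    bottom = polar_transform(x[half:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


def h2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)


def inner_entropies(p, L):
    """[H(V_i | V^{i-1}, Y^L)] for X uniform and Y = X xor Bernoulli(p) noise."""
    joint = {}                                   # (v, y) -> probability
    for x in product((0, 1), repeat=L):
        v = tuple(polar_transform(list(x)))
        for noise in product((0, 1), repeat=L):
            flips = sum(noise)
            prob = 0.5 ** L * p ** flips * (1 - p) ** (L - flips)
            if prob == 0.0:
                continue
            y = tuple(a ^ b for a, b in zip(x, noise))
            joint[(v, y)] = joint.get((v, y), 0.0) + prob

    def entropy_of_prefix(i):
        # Joint entropy H(V^i, Y^L).
        marginal = {}
        for (v, y), prob in joint.items():
            key = (v[:i], y)
            marginal[key] = marginal.get(key, 0.0) + prob
        return -sum(q * log2(q) for q in marginal.values() if q > 0)

    joint_entropy = [entropy_of_prefix(i) for i in range(L + 1)]
    return [joint_entropy[i + 1] - joint_entropy[i] for i in range(L)]


def deterministic_set(p, L, eps):
    """0-based indices i with H(V_i | V^{i-1}, Y^L) <= eps."""
    return [i for i, h in enumerate(inner_entropies(p, L)) if h <= eps]


print([round(h, 3) for h in inner_entropies(0.05, 4)])
print(deterministic_set(0.05, 4, 0.2))
```

The enumeration visits $2^{2L}$ pairs, so it is only feasible for toy sizes; its purpose is to show what the index set means, not how to compute it efficiently.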

To compute the outer index set $\mathcal{F}_J$ even approximately requires more effort. In principle, we can
again use the above algorithms, which require a description of the "super-source" seen by the outer
layer, i.e., the source which outputs the triple of random variables
$(V^L[\mathcal{E}_K], (Y^L, V^L[\mathcal{E}_K^c]), (Z^L, V^L[\mathcal{E}_K^c]))$.
However, its alphabet size is exponential in $L$, and thus such a direct approach will not be efficient
in the overall blocklength $N$. Nonetheless, due to the structure of the inner layer, it is perhaps
possible that the method of approximation by limiting the alphabet size [28, 29] can be extended to
this case. In particular, a recursive construction motivated by the decoding operation introduced
in [ ] could potentially lead to an efficient computation of the index set $\mathcal{F}_J$.

IV. PRIVATE CHANNEL CODING SCHEME 

Our private channel coding scheme is a simple modification of the secret-key agreement protocol
of the previous section. Again it consists of two layers, an inner layer which ensures transmitted
messages can be reliably decoded by the intended receiver, and an outer layer which guarantees
privacy from the unintended receiver. The basic idea is to simply run the key agreement scheme in
reverse, inputting messages to the protocol where secret key bits would be output in key agreement.
The immediate problem in doing so is that key agreement also produces outputs besides the secret
key, so the procedure is not immediately reversible. To overcome this problem, the encoding
operations here simulate the random variables output in the key agreement protocol, and then
perform the polar transformations $G_M^K$ and $G_L$ in reverse order.^7

The scheme is visualized in Figure 2 and described in detail in Protocol 2. Not explicitly shown
is the simulation of the bits $U^M[\mathcal{F}_J^c]$ at the outer layer and the bits $V^L[\mathcal{E}_K^c]$ at the
inner layer. The outer layer, whose simulated bits are nearly deterministic, makes use of the method
described in [30, Definition 1], while the inner layer, whose bits are nearly uniformly distributed,
follows [11, Section IV]. Both proceed by successively sampling from the individual bit distributions
given all previous values in the particular block, i.e., constructing $V_j$ by sampling from
$P_{V_j | V^{j-1}}$. These distributions can be efficiently constructed, as described in Section IV C.

Note that a public channel is used to communicate the information reconciliation information 
to Bob, enabling reliable decoding. However, it is possible to dispense with the public channel and 
still achieve the same rate and efficiency properties, as will be discussed in Section IV C. 

In the following we assume that the message $M^J$ to be transmitted is uniformly distributed
over the message set $\mathcal{M}^J = \{0, 1\}^J$. As mentioned in Section II D, it may be desirable to have
a private coding scheme that works for an arbitrarily distributed message. This can be achieved
by assuming that the wiretap channel $\mathsf{W}$ is symmetric, or more precisely, by assuming that the two
channels $\mathsf{W}_1 : \mathcal{X} \to \mathcal{Y}$ and $\mathsf{W}_2 : \mathcal{X} \to \mathcal{Z}$ induced by $\mathsf{W}$ are
symmetric. We can define a super-channel
$\mathsf{W}' : \mathcal{T} \to \mathcal{Y}^L \times \mathcal{Z}^L \times \mathcal{C}$ which consists of an inner encoding
block and $L$ basic channels $\mathsf{W}$.^8 The super-channel $\mathsf{W}'$ again induces two channels
$\mathsf{W}'_1 : \mathcal{T} \to \mathcal{Y}^L \times \mathcal{C}$ and $\mathsf{W}'_2 : \mathcal{T} \to \mathcal{Z}^L \times \mathcal{C}$.
Arikan showed that $\mathsf{W}_1$ respectively $\mathsf{W}_2$ being symmetric implies that $\mathsf{W}'_1$ respectively
$\mathsf{W}'_2$ is symmetric [5, Proposition 13]. It has been shown in [22, Proposition 3] that for symmetric
channels polar codes remain reliable for an arbitrary distribution of the message bits. We thus
conclude that if $\mathsf{W}_1$ is assumed to be symmetric, our coding scheme remains reliable for arbitrarily
distributed messages. Assuming a symmetric channel $\mathsf{W}_2$ implies that $\mathsf{W}'_2$ is symmetric,
which proves that our scheme is strongly secure for arbitrarily distributed messages.^9

Protocol 2: Private channel coding

Given: Index sets $\mathcal{E}_K$ and $\mathcal{F}_J$ (code construction)^10
Notation: Message to be transmitted: $m^J$

Outer encoding: Let $u^M[\mathcal{F}_J] = m^J$ and $u^M[\mathcal{F}_J^c] = r^{KM-J}$,^11 where $r^{KM-J}$ is
        (randomly) generated as explained in [30, Definition 1]. Let $t^M = G_M^K u^M$.
Inner encoding: For all $i \in \{0, L, \dots, L(M-1)\}$, Alice does the following: let
        $v_{i+1}^{i+L}[\mathcal{E}_K] = t_{(i/L)+1}$ and $v_{i+1}^{i+L}[\mathcal{E}_K^c] = s_{i+1}^{i+L-K}$, where
        $s_{i+1}^{i+L-K}$ is (randomly) generated as explained in [11, Section IV]. Send
        $c_{(i/L)+1} := s_{i+1}^{i+L-K}$ over a public channel to Bob. Finally, compute
        $x_{i+1}^{i+L} = G_L v_{i+1}^{i+L}$.
Transmission: $(y^N, z^N) = \mathsf{W}^N x^N$
Inner decoding: Bob uses the standard decoder [5, 11] with inputs $c_{(i/L)+1}$ and $y_{i+1}^{i+L}$ to obtain
        $\hat{v}_{i+1}^{i+L}$, and hence $\hat{t}_{(i/L)+1} = \hat{v}_{i+1}^{i+L}[\mathcal{E}_K]$, for each
        $i \in \{0, L, \dots, L(M-1)\}$.
Outer decoding: Bob computes $\hat{u}^M = G_M^K \hat{t}^M$ and outputs a guess for the sent message
        $\hat{m}^J = \hat{u}^M[\mathcal{F}_J]$.
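
The idea of running the key agreement scheme in reverse can be checked on a toy instance. The sketch below implements only the structural part of the encoder: the message is placed on the positions $\mathcal{F}_J$, the remaining positions are filled with externally supplied bits, and the two transforms are applied. In the actual scheme the filler bits are sampled from specific conditional distributions, which is not modeled here; the index sets are arbitrary placeholders, indices are 0-based, the kernel [[1, 1], [0, 1]] is assumed, and `noiseless_extract` stands in for Bob's decoder on a noiseless channel.

```python
def polar_transform(x):
    """G x over GF(2), assuming the kernel [[1, 1], [0, 1]] (self-inverse)."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    top = polar_transform(x[:half])
    bottom = polar_transform(x[half:])
    return [a ^ b for a, b in zip(top, bottom)] + bottom


def encode(message, filler_outer, filler_inner, L, M, E_K, F_J):
    """Structural sketch of the Protocol 2 encoder (fillers are plain inputs)."""
    K = len(E_K)
    u = [[None] * M for _ in range(K)]
    for bit, (j, i) in zip(message, F_J):        # message bits on F_J
        u[j][i] = bit
    fill = iter(filler_outer)
    for j in range(K):                           # remaining outer positions
        for i in range(M):
            if u[j][i] is None:
                u[j][i] = next(fill)
    t = [polar_transform(u[j]) for j in range(K)]    # outer layer, t[j][b]
    kept = set(E_K)
    E_c = [i for i in range(L) if i not in kept]
    x = []
    for b in range(M):
        v = [0] * L
        for j, pos in enumerate(E_K):
            v[pos] = t[j][b]
        for s, pos in zip(filler_inner[b], E_c):
            v[pos] = s
        x.extend(polar_transform(v))             # inner layer
    return x


def noiseless_extract(x, L, M, E_K, F_J):
    """Undo both layers when the channel introduces no errors."""
    t_blocks = []
    for b in range(M):
        v = polar_transform(x[b * L:(b + 1) * L])
        t_blocks.append([v[i] for i in E_K])
    u = [polar_transform([t_blocks[b][j] for b in range(M)])
         for j in range(len(E_K))]
    return [u[j][i] for (j, i) in F_J]


E_K, F_J = [2, 3], [(0, 0), (1, 0)]
message = [1, 0]
x = encode(message, filler_outer=[1, 1], filler_inner=[[0, 1], [1, 1]],
           L=4, M=2, E_K=E_K, F_J=F_J)
print(noiseless_extract(x, 4, 2, E_K, F_J) == message)   # True
```

The round trip works because each transform is its own inverse under the assumed kernel, so applying the key agreement operations to the encoder output recovers the bits placed on $\mathcal{F}_J$.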



^7 As it happens, $G_L$ is its own inverse.
^8 This super-channel is explained in more detail in Section V B.
^9 This can be seen easily by the strong secrecy condition given in (30) using that $\mathsf{W}'_2$ is symmetric.
^10 By the code construction the channel input distribution $P_X$ is defined. $P_X$ should be chosen such
that it maximizes the scheme's rate.
^11 Again an abuse of notation. See Footnote 6 of Protocol 1.
FIG. 2. The private channel coding scheme for the setup $N = 8$, $L = 4$, $M = 2$, $K = 2$, and $J = 2$.
The message $M^J$ is first sent through an outer encoder which adds some bits (simulated as explained in
[11, Section IV]) and applies the polarization transform $G_M^K$. The output $T^M = (T_{(1)}, \dots, T_{(K)})^M$ is then
encoded a second time by $M$ independent identical blocks. Note that each block again adds redundancy
(as explained in [30, Definition 1]) before applying the polarization transform $G_L$. Each inner encoding
block sends the frozen bits over a public channel to Bob. Note that this extra public communication can be
avoided as justified in Section IV C. The output $X^N$ is then sent over $N$ copies of the wiretap channel $\mathsf{W}$ to
Bob. Bob then applies a decoding operation as in the key agreement scheme, Section III.



A. Rate, Reliability, Secrecy, and Efficiency 



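Corollary 9. For any $\beta < \frac{1}{2}$, Protocol 2 satisfies 

Reliability: $\Pr[M^J \neq \hat{M}^J] = O\left(M 2^{-L^{\beta}}\right)$ (54) 

Secrecy: $\left\| P_{M^J, Z^N, C} - \bar{P}_{M^J} \times P_{Z^N, C} \right\|_1 = O\left(\sqrt{N}\, 2^{-N^{\beta}/2}\right)$ (55) 

Rate: $R = H(X|Z) - \frac{1}{L} H(V^L[\mathcal{E}_K^c] \,|\, Z^L)$ (56) 

and its computational complexity is $O(N \log N)$. 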



Proof. Recall that the idea of the private channel coding scheme is to run Protocol 1 backwards. 
Since Protocol 2 simulates the nearly deterministic bits $U^M[\mathcal{F}_J^c]$ at the outer encoder as described 
in [30, Definition 1] and the almost random bits $V^L[\mathcal{E}_K^c]$ at the inner encoder as explained in 
[11, Section IV], it follows that for large values of L and M the private channel coding scheme 
approximates the one-way secret-key scheme setup,^12 i.e., 
$\lim_{N \to \infty} \delta\left(P_{T^M}, P_{(V^L[\mathcal{E}_K])^M}\right) = 0$ and $\lim_{L \to \infty} \delta\left(P_{X^L}, P_{\hat{X}^L}\right) = 0$, 
where $P_{X^L}$ denotes the distribution of the vector $X^L$ which is sent 
over the wiretap channel W and $P_{\hat{X}^L}$ denotes the distribution of Alice's random variable $X^L$ in 
the one-way secret-key agreement setup. We thus can use the decoder introduced in [ ] to decode 
the inner layer. Since we are using M identical independent inner decoding blocks, by the union 
bound we obtain the desired reliability condition. The secrecy and rate statements are immediate 
consequences of Theorem 7. □ 

^12 This approximation can be made arbitrarily precise for sufficiently large values of L and M. 

As mentioned after Theorem 7, to ensure reliability of the protocol, M may not grow exponen- 
tially fast in L. 
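
To make the growth restriction on M concrete, the following Python sketch evaluates the base-2 logarithm of the expression M 2^{-L^β} that appears in the reliability statement, ignoring the constant hidden in the O-notation. The function name and the sample growth rates (M = L^2 versus M = 2^L) are illustrative assumptions, not choices made in the paper.

```python
def log2_reliability_bound(L, log2_M, beta):
    """Return log2 of M * 2^(-L^beta), i.e. log2(M) - L^beta (constants ignored)."""
    return log2_M - L ** beta

if __name__ == "__main__":
    beta = 0.45  # any value below 1/2
    for n in (10, 14, 18):
        L = 2 ** n
        polynomial_M = log2_reliability_bound(L, 2 * n, beta)   # M = L^2
        exponential_M = log2_reliability_bound(L, L, beta)      # M = 2^L
        print(n, polynomial_M, exponential_M)
```

A negative value that keeps decreasing with L indicates a vanishing error bound, while a positive, growing value indicates that the union bound becomes vacuous.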

Corollary 10. The rate of Protocol 2 given in Corollary 9 can be bounded as 

$R \geq \max\left\{0,\, H(X|Z) - H(X|Y) - \frac{o(L)}{L}\right\}$. (57) 

Proof. The proof is identical to the proof of Corollary 8. □ 
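
As a worked instance of the leading term of this bound, consider the special case (an assumption made here for illustration, not a setting singled out in the paper) of a uniform binary input X, with Bob observing X through a binary symmetric channel with crossover probability `p_bob` and Eve through one with crossover probability `p_eve`. Then H(X|Y) and H(X|Z) are the binary entropies of the respective crossover probabilities, and the asymptotic rate bound can be evaluated directly.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy in bits, with the convention 0 * log2(0) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)

def rate_lower_bound(p_bob, p_eve):
    """max{0, H(X|Z) - H(X|Y)} for uniform X over two binary symmetric channels."""
    return max(0.0, binary_entropy(p_eve) - binary_entropy(p_bob))

if __name__ == "__main__":
    for p_bob, p_eve in [(0.05, 0.20), (0.10, 0.10), (0.20, 0.05)]:
        print(p_bob, p_eve, rate_lower_bound(p_bob, p_eve))
```

The bound is positive only when Eve's channel is noisier than Bob's in this symmetric example; the o(L)/L correction from the finite blocklength is not modelled.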

B. Achieving the Secrecy Capacity 

Corollaries 6 and 10 immediately imply that our private channel coding scheme achieves the 
secrecy capacity for the setup where W is more capable. If we can find the optimal auxiliary 
random variable V in (32), Protocol 2 can achieve the secrecy capacity for a general wiretap 
channel scenario. We define a super-channel $W' : \mathcal{V} \to \mathcal{Y} \times \mathcal{Z}$ which includes the random variable 
X and the wiretap channel W. The super-channel W' is characterized by its transition probability 
distribution $P_{Y,Z|V}$, where V is the optimal random variable solving (32). The private channel 
coding scheme is then applied to the super-channel, achieving the secrecy capacity. Note that 
finding the optimal random variable V might be difficult. 

In Section V, we discuss the question of whether Protocol 2 can achieve a rate that is 
strictly larger than $\max\{0, H(X|Z) - H(X|Y)\}$ if nothing about the optimal auxiliary random 
variable V is known. 



C. Code Construction & Public Channel Communication 

To start the private channel coding scheme, the code construction has to be carried out, i.e., 
the index sets $\mathcal{E}_K$ and $\mathcal{F}_J$ as defined in (34) and (35) need to be computed. This can be done as 
explained in Section III C. The code construction defines the input distribution $P_X$ to the wiretap 
channel, which should be chosen such that it maximizes the scheme's rate given in (56). 

We next explain how the communication $C \in \mathcal{C}$ from Alice to Bob can be reduced such 
that it does not affect the rate, i.e., we show that we can choose $|C| = o(L)$. Recall that we 
defined the index set $\mathcal{E}_K := \mathcal{D}_{\epsilon_1}^{L}(X|Y)$ in (34). Let $\mathcal{G} := \mathcal{R}_{\epsilon_1}^{L}(X|Y)$ using the notation introduced 
in (2) and $\mathcal{I} := [L] \setminus (\mathcal{E}_K \cup \mathcal{G}) = \mathcal{E}_K^c \setminus \mathcal{G}$. As explained in Section II B, $\mathcal{G}$ consists of the outputs $V_j$ 
which are essentially uniformly random, even given all previous outputs $V^{j-1}$ as well as $Y^L$, where 
$V^L = G_L X^L$. The index set $\mathcal{I}$ consists of the outputs $V_j$ which are neither essentially uniformly 
random nor essentially deterministic given $V^{j-1}$ and $Y^L$. The polarization phenomenon stated 
in Theorem 1 ensures that this set is small, i.e., that $|\mathcal{I}| = o(L)$. Since the bits of $\mathcal{G}$ are almost 
uniformly distributed, we can fix these bits independently of the message, as part of the code 
construction, without affecting the reliability of the scheme for large blocklengths.^13 We thus 
only need to communicate the bits belonging to the index set $\mathcal{I}$. 
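
The partition of the index set [L] into an essentially deterministic part, an essentially random part, and a small remainder can be illustrated numerically in a special case. The sketch below assumes, purely for illustration, a uniform binary X that Bob observes through a binary erasure channel; in that case the conditional entropy of the j-th transformed bit given the previous ones and Bob's observations equals the erasure probability of the j-th synthesized channel, which obeys the recursion z -> 2z - z^2 and z -> z^2. The function names, the threshold `eps`, and the erasure probability are hypothetical choices, and the index ordering ignores any bit-reversal permutation, which does not change the set sizes.

```python
def bec_conditional_entropies(erasure_prob, n):
    """Conditional entropies (in bits) of the 2^n transformed bits for a BEC."""
    z = [erasure_prob]
    for _ in range(n):
        nxt = []
        for e in z:
            nxt.append(2 * e - e * e)  # degraded branch of one polarization step
            nxt.append(e * e)          # upgraded branch of one polarization step
        z = nxt
    return z

def partition(z, eps):
    """Split indices into deterministic (D), random (G) and unpolarized (I) sets."""
    D = [i for i, e in enumerate(z) if e <= eps]
    G = [i for i, e in enumerate(z) if e >= 1 - eps]
    I = [i for i, e in enumerate(z) if eps < e < 1 - eps]
    return D, G, I

if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        z = bec_conditional_entropies(0.5, n)
        D, G, I = partition(z, eps=0.01)
        L = len(z)
        print(L, len(D) / L, len(G) / L, len(I) / L)
```

In this example the fraction of indices in the unpolarized set shrinks as L grows, which is the behaviour that allows the public communication to be restricted to a vanishing fraction of the bits. A fixed threshold is used for simplicity, whereas the text lets the threshold decrease with L.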

We can send the bits belonging to $\mathcal{I}$ over a separate public noiseless channel. Alternatively, we 
could send them over the wiretap channel W that we are using for private channel coding. However, 
since W is assumed to be noisy and it is essential that the bits in $\mathcal{I}$ are received by Bob without 
any errors, we need to protect them using an error correcting code. To not destroy the essentially 
linear computational complexity of our scheme, the code needs to have an encoder and decoder 
that are practically efficient. Since $|\mathcal{I}| = o(L)$, we can use any error correcting code that has a 
non-vanishing rate. For symmetric binary DMCs, polar coding can be used to transmit reliably 
an arbitrarily distributed message [22, Proposition 3]. We can therefore symmetrize our wiretap 
channel W and use polar codes to transmit the bits in $\mathcal{I}$.^14 

^13 Recall that we choose $\epsilon_1 = O\left(2^{-L^{\beta}}\right)$ for $\beta < \frac{1}{2}$, such that for $L \to \infty$ the index set $\mathcal{G}$ contains only 
uniformly distributed bits. 

As the reliability of the scheme is the average over the possible assignments of the random bits 
belonging to $\mathcal{I}$ (or even $\mathcal{E}_K^c$), at least one choice must be as good as the average, meaning a reliable, 
efficient, and deterministic scheme must exist. However, it might be computationally hard to find 
this choice. 



V. DISCUSSION 

In this section, we describe two open problems, both of which address the question of whether 
rates beyond $\max\{0, H(X|Z) - H(X|Y)\}$ can be achieved by our key agreement scheme, even 
if the optimal auxiliary random variables V and U are not given, i.e., if we run Protocol 1 
directly for X (instead of U) without making V public. It may even be possible that the key 
agreement scheme achieves the optimal rate; no result to our knowledge implies otherwise. The 
two questions could also be formulated in the private coding scenario, i.e., whether rates beyond 
$\max\{0, \max_{P_X} H(X|Z) - H(X|Y)\}$ are possible, but as positive answers in the former context 
imply positive answers in the latter, we shall restrict attention to the key agreement scenario for 
simplicity. 

A. Polarization with Bob's or Eve's Side Information 

Question 1. Do there exist distributions $P_{X,Y,Z}$ for which the rate of Protocol 1 satisfies 

$R > \max\{0, H(X|Z) - H(X|Y)\}$, for $N \to \infty$? (58) 

An equivalent formulation of this question is whether inequality (52) is always tight for large 
enough N, i.e., 

Question 1'. Is it possible that 

$\lim_{L \to \infty} \frac{1}{L} H(V^L[\mathcal{E}_K^c] \,|\, Z^L) < \lim_{L \to \infty} \frac{1}{L} |\mathcal{E}_K^c|$, for $R > 0$? (59) 

Using the polarization phenomenon stated in Theorem 1 we obtain 

$\lim_{L \to \infty} \frac{1}{L} |\mathcal{E}_K^c| = H(X|Y)$, (60) 

which together with (59) would imply that $R > \max\{0, H(X|Z) - H(X|Y)\}$ for $N \to \infty$ is 
possible. Relation (59) can only be satisfied if the high-entropy set with respect to Bob's side 
information, i.e., the set $\mathcal{E}_K^c$, is not always a high-entropy set with respect to Eve's side information. 
Thus, the question of rates in the key agreement protocol is closely related to fundamental structural 
properties of the polarization phenomenon. 



^14 Note that the symmetrization of the channel will reduce its rate, which however does not matter as we need a 
non-vanishing rate only. 
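For less noisy channels W defined by $P_{YZ|X}$ (cf. Section II A), these questions can be answered 
in the negative. In this case we have $H(X^L|Z^L) \geq H(X^L|Y^L)$, and since $V^L[\mathcal{E}_K^c]$ is a deterministic 
function of $X^L$, 

$\lim_{L \to \infty} \frac{1}{L} H(V^L[\mathcal{E}_K^c] \,|\, Z^L) \geq \lim_{L \to \infty} \frac{1}{L} H(V^L[\mathcal{E}_K^c] \,|\, Y^L) = \lim_{L \to \infty} \frac{1}{L} |\mathcal{E}_K^c|$. (61) 

Thus, (59) cannot hold. The final equality can be justified as follows. Recall that we defined 
$\mathcal{E}_K := \mathcal{D}_{\epsilon_1}^{L}(X|Y)$ in (34). Let $\mathcal{R}_{L-K} := \mathcal{R}_{\epsilon_1}^{L}(X|Y)$ and $\mathcal{I} := [L] \setminus (\mathcal{E}_K \cup \mathcal{R}_{L-K})$ such that 
$\mathcal{E}_K^c = \mathcal{R}_{L-K} \cup \mathcal{I}$. Recall that we can choose $\epsilon_1 = O\left(2^{-L^{\beta}}\right)$ for $\beta < \frac{1}{2}$. Using the chain rule 
and the polarization phenomenon given in Theorem 1, we obtain 

$\lim_{L \to \infty} \frac{1}{L} H(V^L[\mathcal{E}_K^c] \,|\, Y^L) = \lim_{L \to \infty} \frac{1}{L} \sum_{i=1}^{|\mathcal{E}_K^c|} H\left(V^L[\mathcal{E}_K^c]_i \,\middle|\, V^L[\mathcal{E}_K^c]^{i-1}, Y^L\right)$ (62) 

$\geq \lim_{L \to \infty} \frac{1}{L} \left( (1 - \epsilon_1) |\mathcal{R}_{L-K}| + \epsilon_1 |\mathcal{I}| \right)$ (63) 

$= \lim_{L \to \infty} \frac{1}{L} |\mathcal{E}_K^c|$. (64) 

Using the upper bound of the entropy in terms of the alphabet size we conclude that the 
equality in (61) holds. The fact that (59) is not possible in the setup where W is less noisy 
accords with the one-way secret-key rate formula given in (24), which excludes rates beyond 
$\max\{0, H(X|Z) - H(X|Y)\}$. 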

If the answer to Question 1, or equivalently to Question 1', is "yes", this would give some new 
insights into the problem of finding the optimal auxiliary random variables U, V in (18) (and V in 
(32)), which may be hard in general. 

Furthermore, a positive answer to Question 1 implies that we can send quantum information 
reliably over a quantum channel at a rate that is beyond the coherent information using the scheme 
introduced in [ ]. Since the best known achievable rate for a wide class of quantum channels is the 
coherent information, our scheme would improve this bound. Furthermore, it would be of interest 
to know by how much we can outperform the coherent information.^15 

B. Approximately Less Noisy Super-Channel 

To state the second open problem, consider the super-source which outputs the triple of random 
variables $\left(V^L[\mathcal{E}_K], (Y^L, V^L[\mathcal{E}_K^c]), (Z^L, V^L[\mathcal{E}_K^c])\right)$. For instance, Figure 1 consists of two 
super-sources. The super-source implicitly defines a super-channel W' using the conditional probability 
distribution of the second two random variables given the first. Then we have 

Proposition 11. For sufficiently large L, the channel W' is approximately less noisy, irrespective 
of W. 

Proof. Using the chain rule we can write 

$H(V^L[\mathcal{E}_K] \,|\, V^L[\mathcal{E}_K^c], Y^L) = \sum_{i=1}^{K} H\left(V^L[\mathcal{E}_K]_i \,\middle|\, V^L[\mathcal{E}_K]^{i-1}, V^L[\mathcal{E}_K^c], Y^L\right)$ (65) 

$\leq \sum_{i \in \mathcal{E}_K} H(V_i \,|\, V^{i-1}, Y^L)$ (66) 

$\leq K \epsilon_1$, (67) 

where the last inequality follows by definition of the set $\mathcal{E}_K$. Recall that we can choose 
$\epsilon_1 = O\left(2^{-L^{\beta}}\right)$ for $\beta < \frac{1}{2}$. The polarization phenomenon stated in Theorem 1 ensures that 
$K = O(L)$. Hence, we can apply the following Lemma 12, which proves the assertion. □ 

^15 Since there exist a lot of good converse bounds for sending quantum information reliably over an arbitrary quantum 
channel [31-33], it would be interesting to see how closely they can be met. 

Lemma 12. If $U - \circ - X - \circ - (Y,Z)$ form a Markov chain in the given order and $H(X|Y) \leq \epsilon$ for 
$\epsilon \geq 0$, then $H(U|Y) \leq H(U|Z) + \epsilon$ for all possible distributions of $(U,X)$. 

Proof. Using the chain rule and the non-negativity of the entropy we can write 

$H(U|Y) \leq H(U|Y) + H(X|Y,U)$ (68) 

$= H(U,X|Y)$ (69) 

$= H(X|Y) + H(U|X,Y)$ (70) 

$\leq \epsilon + H(U|X)$ (71) 

$\leq \epsilon + H(U|Z)$. (72) 

Inequality (71) follows by assumption and since conditioning reduces entropy. The final inequality 
uses the data processing inequality. □ 
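
The chain of inequalities (68) to (72) can be checked numerically by taking ε = H(X|Y). The Python sketch below draws random distributions that respect the Markov structure, with U generated first, X depending only on U, and (Y, Z) depending only on X, and evaluates the slack H(U|Z) + H(X|Y) - H(U|Y), which the lemma asserts to be non-negative. The alphabet sizes, helper names, and sampling procedure are arbitrary illustrative choices; the check is a sanity test on random instances, not a proof.

```python
import random
from math import log2

def random_dist(n, rng):
    """A random probability vector of length n with strictly positive entries."""
    w = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

def cond_entropy(joint):
    """H(A|B) in bits for a joint distribution given as {(a, b): probability}."""
    marginal_b = {}
    for (_, b), p in joint.items():
        marginal_b[b] = marginal_b.get(b, 0.0) + p
    return -sum(p * log2(p / marginal_b[b]) for (_, b), p in joint.items() if p > 0)

def lemma12_slack(nu, nx, ny, nz, rng):
    """Return H(U|Z) + H(X|Y) - H(U|Y) for a random Markov chain U - X - (Y, Z)."""
    p_u = random_dist(nu, rng)
    p_x_given_u = [random_dist(nx, rng) for _ in range(nu)]
    p_yz_given_x = [random_dist(ny * nz, rng) for _ in range(nx)]
    uy, uz, xy = {}, {}, {}
    for u in range(nu):
        for x in range(nx):
            for y in range(ny):
                for z in range(nz):
                    p = p_u[u] * p_x_given_u[u][x] * p_yz_given_x[x][y * nz + z]
                    uy[(u, y)] = uy.get((u, y), 0.0) + p
                    uz[(u, z)] = uz.get((u, z), 0.0) + p
                    xy[(x, y)] = xy.get((x, y), 0.0) + p
    return cond_entropy(uz) + cond_entropy(xy) - cond_entropy(uy)

if __name__ == "__main__":
    rng = random.Random(1)
    print(min(lemma12_slack(3, 2, 4, 3, rng) for _ in range(1000)))
```

The smallest slack observed over the random trials should be non-negative up to floating-point error.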

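Proposition 11 and Lemma 12 imply that the DM-WTC W' induced by the super-source 
described above is almost less noisy. More precisely we have for $\beta < \frac{1}{2}$ and $\xi = O\left(L 2^{-L^{\beta}}\right)$ 

$H(T \,|\, V^L[\mathcal{E}_K^c], Y^L) \leq H(T \,|\, V^L[\mathcal{E}_K^c], Z^L) + \xi$, (73) 

for all possible distributions of T, where $T - \circ - V^L[\mathcal{E}_K] - \circ - \left((Y^L, V^L[\mathcal{E}_K^c]), (Z^L, V^L[\mathcal{E}_K^c])\right)$ and 
$|T| \leq K$. Following the proof of Corollary 4, using (73) in (28), we obtain the one-way secret-key rate 
of the super-source as 

$\frac{1}{L} S_{\rightarrow}\left(V^L[\mathcal{E}_K]; Y^L, V^L[\mathcal{E}_K^c] \,\middle|\, Z^L, V^L[\mathcal{E}_K^c]\right)$ 

$= \frac{1}{L} \left( H(V^L[\mathcal{E}_K] \,|\, Z^L, V^L[\mathcal{E}_K^c]) - H(V^L[\mathcal{E}_K] \,|\, Y^L, V^L[\mathcal{E}_K^c]) + \xi \right)$ (74) 

$= \frac{1}{L} H(V^L[\mathcal{E}_K] \,|\, Z^L, V^L[\mathcal{E}_K^c]) - \frac{o(L)}{L}$ (75) 

$= R$. (76) 

The second equation follows by definition of the set $\mathcal{E}_K$ and (76) is according to (47). We thus 
conclude that the one-way secret-key agreement scheme introduced in Section III always achieves 
the one-way secret-key rate for the super-source as defined above. This raises the question of when 
the super-source has the same key rate as the original source, i.e., how much is lost in the first 
layer of our key agreement scheme. 

Question 2. Under what conditions does $\frac{1}{L} S_{\rightarrow}\left(V^L[\mathcal{E}_K]; Y^L, V^L[\mathcal{E}_K^c] \,\middle|\, Z^L, V^L[\mathcal{E}_K^c]\right) = S_{\rightarrow}(X;Y|Z)$ 
hold? 

Having $\frac{1}{L} S_{\rightarrow}\left(V^L[\mathcal{E}_K]; Y^L, V^L[\mathcal{E}_K^c] \,\middle|\, Z^L, V^L[\mathcal{E}_K^c]\right) = S_{\rightarrow}(X;Y|Z)$ implies that Protocol 1 achieves 
the one-way secret-key rate without knowing anything about the optimal auxiliary random variables 
V and U. If W is less noisy, Corollary 4 ensures that $\frac{1}{L} S_{\rightarrow}\left(V^L[\mathcal{E}_K]; Y^L, V^L[\mathcal{E}_K^c] \,\middle|\, Z^L, V^L[\mathcal{E}_K^c]\right) = 
S_{\rightarrow}(X;Y|Z)$ must be satisfied. For other scenarios Question 2 is currently unsolved. 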
For the setup of private channel coding, following the proof of Corollary 6 using (73) shows that 
the secrecy capacity of the super-channel W' is 

$C_s(W') = \frac{1}{L} \left( H(V^L[\mathcal{E}_K] \,|\, Z^L, V^L[\mathcal{E}_K^c]) - H(V^L[\mathcal{E}_K] \,|\, Y^L, V^L[\mathcal{E}_K^c]) + \xi \right)$ (77) 

$= \frac{1}{L} H(V^L[\mathcal{E}_K] \,|\, Z^L, V^L[\mathcal{E}_K^c]) - \frac{o(L)}{L}$ (78) 

$= R$. (79) 

The scheme introduced in Protocol 2 hence achieves the secrecy capacity for the channel W' 
irrespective of the channel W. This raises the question of when the super-channel and the original 
channel have the same secrecy capacity. 

Question 2'. Under what conditions does $C_s(W') = C_s(W)$ hold? 

$C_s(W') = C_s(W)$ being valid implies that Protocol 2 achieves the secrecy capacity of W without 
having knowledge about the optimal auxiliary random variable V. If W is more capable, according 
to Corollary 6 $C_s(W') = C_s(W)$ must hold. For other channels, Question 2' has not yet been 
resolved. 



C. Conclusion 

We have constructed practically efficient protocols (with complexity essentially linear in the 
blocklength) for one-way secret-key agreement from correlated randomness and for private channel 
coding over discrete memoryless wiretap channels. Each protocol achieves the corresponding op- 
timal rate. Compared to previous methods, we do not require any degradability assumptions and 
achieve strong (rather than weak) secrecy. 

Our scheme is formulated for arbitrary discrete memoryless wiretap channels. Using ideas of 
Şaşoğlu et al. [ ], the two protocols presented in this paper can also be used for wiretap channels 
with continuous input alphabets. 